Abstract: Image processing converts an image into digital form and performs operations on it in order
to obtain an enhanced image or to extract some useful information from it. Face recognition is an important task in image
processing. The input of a face recognition system is an image or a video (movie); the output is an identification or
verification of the subject or subjects that appear in it. Another important task in image processing is
facial expression recognition, which has attracted much attention in recent years because of
its importance in realizing highly intelligent human-machine interfaces. Facial expressions play an important role in the cognition
of human emotions, and facial expression recognition is the basis of emotion understanding. In this paper, we propose a
novel method for both face recognition and facial expression recognition. This is helpful in situations where users want to identify
human characters in movies together with their facial expressions.
I. Introduction

There are a number of applications in which face recognition [1] can play an important role, including biometric
authentication, high-technology surveillance and security systems, image retrieval, and passive demographic data collection.
Since the face is central to human behavior and social interaction, a face recognition system could also have great impact in improving
human-computer interaction systems, making them more user-friendly and more human-like. The face is
one of the most important features that characterize human beings. By only looking at people's faces, we
are able not only to tell who they are but also to perceive a lot of information such as their age and emotions. This is
why face recognition has received much interest in the computer vision research community over the past two decades. Figure 1
shows an example of movie character identification.

There are two main steps involved in recognizing the names of the people present in an image: face detection
and name classification [2], which are applied consecutively. In order to exploit the uniqueness of faces in name recognition, the
first step is to detect and localize the faces in the given image. This is the task performed by face detection systems.




Figure 1: Example of movie character identification

Sometimes we also want to know the facial expression (e.g., happy, sad, smiling, fearful) of the recognized face. Facial
expression recognition [3] [4] is a process performed by humans or computers, which consists of the following steps (Figure 2):

1. Locating faces in the scene (e.g., in an image; this step is also referred to as face detection),

2. Extracting facial features from the detected face region (e.g., detecting the shape of the facial components or describing 
the texture of the skin in a facial area; this step is referred to as facial feature extraction), 

3. Analyzing the motion of facial features and the changes in their appearance, and classifying this
information into facial-expression-interpretative categories such as facial muscle activations like smile or frown,
emotion (affect) categories like happiness or anger, attitude categories like (dis)liking or ambivalence, etc. (this step is
also referred to as facial expression interpretation).
Figure 2: Different facial expressions (fear, anger, disgust)

Monitoring and interpreting facial expressions can also provide important information to police, lawyers, security
and intelligence agents regarding a person's identity (research in psychology suggests that facial expression recognition is
much easier for familiar persons because people seem to display the same "typical" patterns of facial behavior in the
same situations), deception (relevant studies in psychology suggest that visual features of facial expression function as cues
to deception), and attitude (research in psychology indicates that social signals including accord and mirroring, i.e. mimicry of
the facial expressions, postures, etc., of one's interaction partner, are typical, usually unconscious gestures of wanting to get
along with and be liked by the interaction partner). Automated facial reaction monitoring systems could form a valuable tool
in law enforcement, since at present only informal interpretations are typically used.



II. Related work 

In [5], [6], faces are clustered by appearance, and the faces of a particular character are
expected to be collected in a few pure clusters. Names for the identified clusters are then manually selected from the cast list.
In [7], the authors proposed to manually label an initial set of face clusters and then cluster the remaining face instances based on
clothing within scenes. In [8], the authors addressed the problem of finding particular characters by building a
model/classifier of the character's appearance from user-provided training data. An interesting work combining character
identification with web image retrieval is proposed in [9]. The character names in the cast are used as queries to search for
face images, which constitute the gallery set. The probe face tracks in the movie are then identified as one of the characters by
multi-task joint sparse representation and classification.

In [10], the authors proposed to combine the film script with the subtitles for local face-name matching. Researchers
from the University of Pennsylvania utilized a readily available time-stamped resource, the closed captions, which was
demonstrated to be more reliable than OCR-based subtitles [11]. They investigated the ambiguity issues in the local
alignment between the video, screenplay and closed captions, and formulated a partially supervised multi-class classification
problem. More recently, they attempted to address the character identification problem without the use of a screenplay [12].

In [13], the authors proposed the Facial Action Coding System (FACS), which represents a facial expression by a set
of facial action units. In [14], the authors proposed an approach for analyzing and representing the dynamics of facial
expression. Their system consists of locating and tracking the prominent facial features, optical flow analysis, and
classification. In [15], the authors extended the work of [14] by using a connectionist architecture. Individual emotion
networks were trained by viewing a set of sequences of one emotion for many subjects. The trained neural network was
then tested on emotion recognition. In [16], the authors provided a facial expression representation that characterizes
facial muscle activation. Facial motion is estimated by fitting a 3D deformable facial model to the face in an
image, which yields the muscle-based representation. In [17], the authors developed a facial expression recognition method using a
synergetic pattern recognition approach. In [18], the authors proposed a facial expression recognition method that identifies the
shape of the mouth feature only. In [19], the authors used simple binary measurements (0 or 1) of forehead wrinkles, eye opening,
nostril furrow deepening, mouth opening, and eyebrow motion to recognize human facial expressions.

III. Proposed work 

The proposed work is shown in Figure 3. Our approach to character recognition is motivated by the
Bag-of-Features method [20]. The Bag-of-Features method extracts feature points (i.e., image points that are described
not necessarily by their color/intensity values, but by their local neighborhood based on, e.g., gradient information) from a
set of training images. In the feature space, the feature points are grouped by a clustering algorithm. Based on the resulting
clusters (all clusters together are referred to as the code book, and one cluster is referred to as a visual word), occurrence
histograms are then generated for each body part image. Occurrence histograms reflect how many feature points are assigned
to each visual word. A classifier is then trained on the obtained histograms. Our approach is built on SIFT- [21] and
CIE L*u*v* color-based code books that are obtained by clustering with k-means. A non-linear multi-class Support Vector
Machine (SVM) is learned on the occurrence histograms.
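The code book and occurrence histogram steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (kmeans, nearest_word, occurrence_histogram) are hypothetical, the toy 2-D tuples stand in for real SIFT or CIE L*u*v* descriptors, and the histogram is normalised to sum to 1, which the text does not specify.

```python
import math
import random

def nearest_word(descriptor, codebook):
    """Index of the visual word (cluster centre) closest to a descriptor."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))

def kmeans(descriptors, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centres (the code book)."""
    rng = random.Random(seed)
    centres = [list(c) for c in rng.sample(descriptors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest_word(d, centres)].append(d)
        for i, group in enumerate(groups):
            if group:  # keep the old centre if a cluster becomes empty
                centres[i] = [sum(col) / len(group) for col in zip(*group)]
    return centres

def occurrence_histogram(descriptors, codebook):
    """Share of descriptors assigned to each visual word (assumed normalisation)."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

# Toy stand-ins for descriptors extracted from training images.
training = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
codebook = kmeans(training, k=2)

# Histogram for one (hypothetical) body part image with three feature points.
hist = occurrence_histogram([(0.05, 0.05), (4.9, 5.0), (5.2, 5.1)], codebook)
```

The resulting histograms, one per body part image, would then serve as the input vectors on which the SVM is trained.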
The trained Support Vector Machines (or SVM models) are then used to predict the identity of the detected person.
Probabilistic votes of connected body parts (i.e., body parts that belong to one and the same person) are combined for a more
stable prediction. The training data is generated from an annotation data set in which the name of the corresponding
character is noted for each body part. Based on the obtained annotation data, codebooks are generated and SVM
models are learned. The codebooks and the SVM models are then applied to the entire video file. In this way,
particular (human) characters are recognized at different points in time in a given video file. After identifying a particular
character, we need to identify the facial expression of that character.
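How the probabilistic votes of connected body parts are combined is not specified in the text. One simple possibility, shown here purely as an assumption, is to average the per-character probabilities over all body parts of the same person and pick the character with the highest mean. The function name combine_votes and the character names are hypothetical.

```python
def combine_votes(part_probabilities):
    """Average per-character probabilities over the body parts of one person.

    part_probabilities: list of dicts mapping character name -> probability,
    one dict per connected body part. Averaging is an assumed combination
    rule, not one stated in the paper.
    """
    if not part_probabilities:
        raise ValueError("at least one body part is required")
    totals = {}
    for probs in part_probabilities:
        for name, p in probs.items():
            totals[name] = totals.get(name, 0.0) + p
    n = len(part_probabilities)
    averaged = {name: total / n for name, total in totals.items()}
    best = max(averaged, key=averaged.get)
    return best, averaged

# Three body parts of one detected person; the second part alone would
# vote for "Bob", but the combined vote is more stable.
votes = [
    {"Alice": 0.6, "Bob": 0.4},
    {"Alice": 0.3, "Bob": 0.7},
    {"Alice": 0.7, "Bob": 0.3},
]
best, averaged = combine_votes(votes)
```

Other rules, such as multiplying the probabilities or weighting body parts differently, would fit the same description equally well.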

To identify the facial expression, template matching is carried out using convolution and correlation
coefficients to find the best match. The desired eye, eyebrow and mouth templates are extracted from
the image, and the extracted results are shown as bounding rectangles. The facial characteristic points (FCPs) are
computed from the top-left coordinate of each template's bounding rectangle. Once the
parameters have been obtained from the FCPs, we set the threshold values and then proceed to create the decision tree. A decision tree is a
classifier in the form of a tree structure. Information gain (IG) is used to select the most useful attributes for classification:

1. The entropy of the total data set is calculated.

2. The data set is then split on the different attributes.

3. The entropy of each branch is calculated and added proportionally to get the total entropy for the split.

4. The resulting entropy is subtracted from the entropy before the split; the result is the information gain.

5. The attribute that has the largest information gain is chosen for the decision node.

6. A branch whose set has an entropy of zero is a leaf node; otherwise, the branch is split further to classify its data set.

The ID3 algorithm is run recursively on the non-leaf branches until all data is classified.
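The attribute selection described above can be sketched as follows. The attribute names (mouth_open, brow_raised) and the labels are hypothetical stand-ins for thresholded FCP parameters; the paper does not list the actual attributes.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy before the split minus the weighted entropy of its branches."""
    before = entropy(labels)
    branches = {}
    for row, label in zip(rows, labels):
        branches.setdefault(row[attribute], []).append(label)
    after = sum(len(b) / len(labels) * entropy(b) for b in branches.values())
    return before - after

def best_attribute(rows, labels, attributes):
    """Attribute with the largest information gain (the next decision node)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))

# Hypothetical binary attributes obtained by thresholding FCP parameters.
rows = [
    {"mouth_open": 1, "brow_raised": 0},
    {"mouth_open": 1, "brow_raised": 1},
    {"mouth_open": 0, "brow_raised": 0},
    {"mouth_open": 0, "brow_raised": 1},
]
labels = ["happy", "happy", "sad", "sad"]

gain_mouth = information_gain(rows, labels, "mouth_open")
gain_brow = information_gain(rows, labels, "brow_raised")
root = best_attribute(rows, labels, ["mouth_open", "brow_raised"])
```

In this toy data set, splitting on mouth_open leaves both branches with zero entropy, so both become leaf nodes and no further recursion is needed.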
Figure 3: Proposed Method
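The template-matching step of the proposed method, which scores candidate regions by correlation coefficient and keeps the top-left coordinate of the best match, can be sketched as below. This is an illustrative sketch under assumptions: images are small grey-level grids given as nested lists, the Pearson correlation coefficient is used as the matching score, and the names correlation_coefficient and match_template are hypothetical.

```python
import math

def correlation_coefficient(patch, template):
    """Pearson correlation between two equally sized grey-level patches."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0  # flat patches carry no pattern

def match_template(image, template):
    """Slide the template over the image; return (top, left, score) of the best match."""
    th, tw = len(template), len(template[0])
    best = (0, 0, -1.0)
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            patch = [row[left:left + tw] for row in image[top:top + th]]
            score = correlation_coefficient(patch, template)
            if score > best[2]:
                best = (top, left, score)
    return best

# Toy image containing the template pattern at row 1, column 2.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 2, 8, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 1], [2, 8]]
top, left, score = match_template(image, template)
```

The returned top-left coordinate, together with the template size, gives the bounding rectangle from which a facial characteristic point would be derived.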



IV. Conclusions 

The proliferation of TV and movies provides a large amount of digital video data. This has led to a need for
efficient and effective techniques for movie or image content understanding and organization. Automatic image or video
annotation is one such key technique. In this paper our focus is on annotating the characters in movies, which is called
movie character identification, together with their facial expressions. Movie character identification is performed with the
Bag-of-Features method, which extracts feature points and uses SIFT and CIE L*u*v* color-based code books obtained by clustering
with k-means. To identify the facial expression, template matching is carried out using convolution and
correlation coefficients to find the best match. The facial characteristic points (FCPs) are computed from
the top-left coordinate of each template's bounding rectangle. Once the parameters have been obtained from the FCPs, we
set the threshold values and then proceed to create the decision tree. After classification, we obtain the facial
expression of the identified character.